SATURATION: AN EFFICIENT ITERATION STRATEGY FOR
SYMBOLIC STATE-SPACE GENERATION* 


GIANFRANCO CIARDO, GERALD LÜTTGEN, AND RADU SIMINICEANU

Abstract. This paper presents a novel algorithm for generating state spaces of asynchronous systems 
using Multi-valued Decision Diagrams. In contrast to related work, the next-state function of a system 
is not encoded as a single Boolean function, but as cross-products of integer functions. This permits the 
application of various iteration strategies to build a system’s state space. In particular, this paper introduces 
a new elegant strategy, called saturation, and implements it in the tool SMART. On top of usually performing
several orders of magnitude faster than existing BDD-based state-space generators, the algorithm’s required 
peak memory is often close to the final memory needed for storing the overall state spaces. 

Key words. iteration strategy, multi-valued decision diagrams, saturation, state-space generation

Subject classification. Computer Science 

1. Introduction. State-space generation is one of the most fundamental challenges for many formal 
verification tools, such as model checkers [16]. The high complexity of today’s digital systems requires 
constructing and storing huge state spaces in the relatively small memory of a workstation. One research 
direction widely pursued in the literature suggests the use of decision diagrams, usually Binary Decision 
Diagrams [8] (BDDs), as a data structure for implicitly representing large sets of states in a compact fashion. 
This proved to be very successful for the verification of synchronous digital circuits, as it increased the 
manageable sizes of state spaces from about $10^6$ states, with traditional explicit state-space generation
techniques [18, 19], to about $10^{20}$ states [10]. Unfortunately, symbolic techniques are known not to work
well for asynchronous systems, such as communication protocols, which suffer from state-space explosion. 

The latter problem was addressed in previous work by the authors in the context of state-space generation
using Multi-valued Decision Diagrams [25] (MDDs), which exploited the fact that, in event-based
asynchronous systems, each event updates just a few components of a system’s state vector [11]. Hence, firing 
an event requires only the application of local next-state functions and the local manipulation of MDDs. 
This is in contrast to classic BDD-based techniques which construct state spaces by iteratively applying a 
single, global next-state function which is itself encoded as a BDD [28]. Additionally, in most concurrency 
frameworks including Petri nets [31] and process algebras [5], next-state functions satisfy a product form 
allowing each component of the state vector to be updated somewhat independently of the others. Experimental
results implementing these ideas of locality showed significant improvements in speed and memory
consumption when compared to other state-space generators [30].

In this paper we take our previous approach a significant step further by observing that the reachable
state space of a system can be built by firing the system’s events in any order, as long as every event is 
considered often enough [21]. We exploit this freedom by proposing a novel strategy which exhaustively fires 
all events affecting a given MDD node, thereby bringing it to its final saturated shape. Moreover, nodes are 
considered in a depth-first fashion, i.e., when a node is processed, all its descendants are already saturated. 
The resulting state-space generation algorithm is not only concise, but also allows for an elegant proof 
of correctness. Compared to our previous work [11], saturation eliminates much administration overhead, 
reduces the average number of firing events, and enables a simpler and more efficient cache management. 

We implemented the new algorithm in the tool SMART [12], and experimental studies indicate that 
it performs on average about one order of magnitude faster than our old algorithm and several orders of 
magnitude faster than other existing state-space generators [11]. Even more important and in contrast to 
related work, the peak memory requirements of our algorithm are often close to its final memory requirements. 
In the case of the well-known dining philosophers’ problem, we are able to construct the associated state 
space of about $10^{627}$ states, for 1000 philosophers, in under 1 second on an 800 MHz Pentium III PC using only
390 KB of memory. Our results imply that future state-based verification tools will be able to handle much
larger asynchronous systems than is currently possible and will also provide faster feedback to engineers. 

The remainder of this paper is organized as follows. The next section introduces our formal framework 
and notation, including MDDs. Section 3 then presents our idea of node saturation as well as our novel
state-space generation algorithm. Some implementation details are discussed in Section 4, and our algorithm is
evaluated in Section 5 by applying it to a suite of asynchronous system models. Finally, related work is 
surveyed in Section 6, while Section 7 contains our conclusions and directions for future work. 

2. MDDs for Encoding Structured State Spaces. A discrete-state system model expressed in a
high-level formalism must specify three objects: (i) $\hat{\mathcal{S}}$, the set of potential states describing the “type” of
states; (ii) $s \in \hat{\mathcal{S}}$, the initial state of the system; and (iii) $\mathcal{N} : \hat{\mathcal{S}} \to 2^{\hat{\mathcal{S}}}$, the next-state function, describing
which states can be reached from a given state in a single system step. In many cases, such as Petri nets and
process algebras, a model expresses this function as a union $\mathcal{N} = \bigcup_{e \in \mathcal{E}} \mathcal{N}_e$, where $\mathcal{E}$ is a finite set of events
and $\mathcal{N}_e$ is the next-state function associated with event $e$. We say that $\mathcal{N}_e(s)$ is the set of states the system
can enter when event $e$ occurs, or fires, in state $s$. Moreover, event $e$ is called disabled in $s$ if $\mathcal{N}_e(s) = \emptyset$;
otherwise, it is enabled.

The reachable state space $\mathcal{S} \subseteq \hat{\mathcal{S}}$ of the model under consideration is the smallest set containing the
initial system state $s$ and being closed with respect to $\mathcal{N}$, i.e., $\mathcal{S} = \{s\} \cup \mathcal{N}(s) \cup \mathcal{N}(\mathcal{N}(s)) \cup \cdots = \mathcal{N}^*(s)$,
where $^*$ denotes reflexive and transitive closure. When $\mathcal{N}$ is composed of several functions $\mathcal{N}_e$, for $e \in \mathcal{E}$,
we can iterate these functions in any order, as long as we consider each $\mathcal{N}_e$ often enough. This results in
“chaotic” fixed point iterations, which are known to yield the desired fixed point, i.e., the reachable state
space $\mathcal{S}$ [21]. In other words, $i \in \mathcal{S}$ if and only if it can be reached from $s$ through zero or more event firings.
In this paper we assume that $\mathcal{S}$ is finite; however, for most practical asynchronous systems, the size of $\mathcal{S}$ is
enormous due to the state-space explosion problem.
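
As an illustration of the chaotic fixed-point iteration described above, the following minimal Python sketch computes a reachable state space explicitly, one event at a time. The toy model (functions `inc_a` and `inc_b`) and the function name `reachable` are assumptions made for this example only; they do not come from the paper or from SMART.

```python
def reachable(initial, events):
    """Explicit reachability: apply the per-event next-state functions in
    any order until a full pass over all events adds no new state."""
    space = {initial}
    changed = True
    while changed:
        changed = False
        for fire in events:              # any fixed or varying order works
            new = set()
            for state in space:
                new |= fire(state)       # fire(state) plays the role of N_e(state)
            if not new <= space:
                space |= new
                changed = True
    return space


# Hypothetical two-component model with states (a, b).
def inc_a(state):
    a, b = state
    return {(a + 1, b)} if a < 2 else set()


def inc_b(state):
    a, b = state
    return {(a, b + 1)} if a > 0 and b < 1 else set()


SPACE = reachable((0, 0), [inc_a, inc_b])
```

Passing the events in the opposite order yields the same set, which is the freedom that an iteration strategy exploits.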

2.1. Multi-valued Decision Diagrams. One way to cope with this problem is to use efficient data
structures to encode $\mathcal{S}$. This is usually possible when the system has some structure. We consider the
common case in asynchronous system design, where a system model is composed of $K$ submodels, for some
$K \in \mathbb{N}$, so that a global system state is a $K$-tuple $(i^K, \ldots, i^1)$, where $i^k$ is the local state for submodel $k$. (We
use superscripts for submodel indexes, not for exponentiation, and subscripts for event indexes.)

[Figure: a four-level MDD over $\mathcal{S}^4$, $\mathcal{S}^3$, $\mathcal{S}^2$, $\mathcal{S}^1$ (diagram not reproduced), encoding
$\mathcal{S}$ = {1000, 1010, 1100, 1110, 1210, 2000, 2010, 2100, 2110, 2210, 3010, 3110, 3200, 3201, 3202, 3210, 3211, 3212}.]

Fig. 2.1. An example MDD and the state space $\mathcal{S}$ encoded by it.

Thus, $\hat{\mathcal{S}} = \mathcal{S}^K \times \cdots \times \mathcal{S}^1$, with each local state space $\mathcal{S}^k$ having some finite size $n^k$. In Petri nets, for example, the
set of places can be partitioned into $K$ subsets, and the marking can be written as the composition of the $K$
corresponding submarkings. When identifying $\mathcal{S}^k$ with the initial integer interval $\{0, \ldots, n^k - 1\}$, for each
$K \ge k \ge 1$, one can encode $\mathcal{S} \subseteq \hat{\mathcal{S}}$ via a (quasi-reduced ordered) MDD, i.e., a directed acyclic graph where:

• Nodes are organized into $K + 1$ levels. We write $\langle k.p \rangle$ to denote a generic node, where $k$ is the level
and $p$ is a unique index for that level. Level $K$ contains only a single non-terminal node $\langle K.r \rangle$, the
root, whereas levels $K - 1$ through 1 contain one or more non-terminal nodes. Level 0 consists of
two terminal nodes, $\langle 0.\mathbf{0} \rangle$ and $\langle 0.\mathbf{1} \rangle$. (We use boldface for indexes 0 or 1 because they have a special
meaning, as we will explain later.)

• A non-terminal node $\langle k.p \rangle$ has $n^k$ arcs pointing to nodes at level $k - 1$. If the $i$-th arc, for $i \in \mathcal{S}^k$,
is to node $\langle k{-}1.q \rangle$, we write $\langle k.p \rangle[i] = q$. Unlike in the original BDD setting [8, 9], we allow for
redundant nodes, having all arcs pointing to the same node. This will be convenient for our purposes,
as eliminating such nodes would lead to arcs spanning multiple levels.

• A non-terminal node cannot duplicate (i.e., have the same pattern of arcs as) another node at the
same level.


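Given a node $\langle k.p \rangle$, we can recursively define the node reached from it through any integer sequence
$\gamma =_{df} (i^k, i^{k-1}, \ldots, i^l) \in \mathcal{S}^k \times \mathcal{S}^{k-1} \times \cdots \times \mathcal{S}^l$ of length $k - l + 1$, for $K \ge k \ge l \ge 1$, as

$$
node(\langle k.p \rangle, \gamma) =
\begin{cases}
\langle k.p \rangle & \text{if } \gamma = (), \text{ the empty sequence} \\
node(\langle k{-}1.q \rangle, \delta) & \text{if } \gamma = (i^k, \delta) \text{ and } \langle k.p \rangle[i^k] = q.
\end{cases}
$$

The substates encoded by $p$ or reaching $p$ are then, respectively,

$$
\mathcal{B}(\langle k.p \rangle) = \{\beta \in \mathcal{S}^k \times \cdots \times \mathcal{S}^1 : node(\langle k.p \rangle, \beta) = \langle 0.\mathbf{1} \rangle\} \quad \text{“below” } \langle k.p \rangle;
$$

$$
\mathcal{A}(\langle k.p \rangle) = \{\alpha \in \mathcal{S}^K \times \cdots \times \mathcal{S}^{k+1} : node(\langle K.r \rangle, \alpha) = \langle k.p \rangle\} \quad \text{“above” } \langle k.p \rangle.
$$

Thus, $\mathcal{B}(\langle k.p \rangle)$ contains the substates that, prefixed by a substate in $\mathcal{A}(\langle k.p \rangle)$, form a (global) state encoded
by the MDD. We reserve the indexes $\mathbf{0}$ and $\mathbf{1}$ at each level $k$ to encode the sets $\emptyset$ and $\mathcal{S}^k \times \cdots \times \mathcal{S}^1$, respectively.
In particular, $\mathcal{B}(\langle 0.\mathbf{0} \rangle) = \emptyset$ and $\mathcal{B}(\langle 0.\mathbf{1} \rangle) = \{()\}$. Figure 2.1 shows a four-level example MDD and the set $\mathcal{S}$
encoded by it; only the highlighted nodes are actually stored.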
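
The definitions above can be made concrete with a small Python sketch of a quasi-reduced MDD. It is only an illustration under stated assumptions: the class name `MDD`, its methods, and the list-based storage are hypothetical, and the local state-space sizes $n^4 = 4$, $n^3 = 3$, $n^2 = 2$, $n^1 = 3$ are inferred from the states listed with Figure 2.1 rather than stated in the text. Indexes 0 and 1 are reserved at every level, as in the paper.

```python
class MDD:
    def __init__(self, sizes):
        # sizes[k] = n^k for levels 1..K; sizes[0] is unused padding.
        self.sizes = sizes
        self.K = len(sizes) - 1
        # nodes[k][p] = tuple of child indexes; slots 0 and 1 are reserved.
        self.nodes = [[None, None] for _ in sizes]
        self.unique = [dict() for _ in sizes]

    def check(self, k, arcs):
        """Return the index of the level-k node with this arc pattern."""
        arcs = tuple(arcs)
        if all(a == 0 for a in arcs):
            return 0                      # encodes the empty set
        if all(a == 1 for a in arcs):
            return 1                      # encodes S^k x ... x S^1
        if arcs not in self.unique[k]:    # no duplicates within a level
            self.nodes[k].append(arcs)
            self.unique[k][arcs] = len(self.nodes[k]) - 1
        return self.unique[k][arcs]

    def build(self, k, substates):
        """Encode a set of substates (i^k, ..., i^1) as a level-k node."""
        if k == 0:
            return 1 if substates else 0
        arcs = []
        for i in range(self.sizes[k]):
            tails = {s[1:] for s in substates if s[0] == i}
            arcs.append(self.build(k - 1, tails))
        return self.check(k, arcs)

    def arcs(self, k, p):
        return (p,) * self.sizes[k] if p < 2 else self.nodes[k][p]

    def below(self, k, p):
        """Enumerate B(<k.p>), the substates encoded below node <k.p>."""
        if k == 0:
            return {()} if p == 1 else set()
        out = set()
        for i, q in enumerate(self.arcs(k, p)):
            out |= {(i,) + tail for tail in self.below(k - 1, q)}
        return out


# The state set listed with Figure 2.1, written as tuples (i^4, i^3, i^2, i^1).
FIG_2_1 = {tuple(int(c) for c in word) for word in (
    "1000 1010 1100 1110 1210 2000 2010 2100 2110 2210 "
    "3010 3110 3200 3201 3202 3210 3211 3212").split()}

mdd = MDD([0, 3, 2, 3, 4])
root = mdd.build(4, FIG_2_1)
```

Enumerating `mdd.below(4, root)` recovers exactly the listed states; the node indexes produced by this sketch need not match those drawn in the figure.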


Many algorithms for generating the state space S using BDDs have been proposed [28], and adapting 
them to MDDs is straightforward. However, a key difference in our new approach is that we do not encode 
the next-state function as an MDD over $2K$ variables, recording the $K$ state components before and after
a system step. Instead, we explicitly and efficiently update MDD nodes directly, adding the new states 
reached through one step of the global next-state function when firing a given event. For asynchronous 
system models, this function is often expressible as the cross-product of local next-state functions. 






2.2. Product-form Behavior. An asynchronous system model exhibits such behavior if, for each
event $e$, its next-state function $\mathcal{N}_e$ can be written as a cross-product of $K$ local functions, i.e., $\mathcal{N}_e =
\mathcal{N}_e^K \times \cdots \times \mathcal{N}_e^1$ where $\mathcal{N}_e^k : \mathcal{S}^k \to 2^{\mathcal{S}^k}$, for all $K \ge k \ge 1$. This requirement is quite natural for two reasons.
First, many modeling formalisms satisfy it, e.g., any Petri net model conforms to this behavior for any
partition of its places. Second, if a given model does not respect the product-form behavior, we can always
coarsen $K$ or refine $\mathcal{E}$ so that it does. As an example, consider a model partitioned into four submodels,
where $\mathcal{N}_e = \mathcal{N}_e^4 \times \mathcal{N}_e^{3,2} \times \mathcal{N}_e^1$, but $\mathcal{N}_e^{3,2} : \mathcal{S}^3 \times \mathcal{S}^2 \to 2^{\mathcal{S}^3 \times \mathcal{S}^2}$ cannot be expressed as a product $\mathcal{N}_e^3 \times \mathcal{N}_e^2$. We
can achieve the product-form requirement by simply partitioning the model into three, not four, submodels.
Alternatively, we may substitute event $e$ with “subevents” satisfying the product form. This is possible
since, in the worst case, we can define a subevent $e_{i,j}$, for each $i = (i^3, i^2)$ and $j = (j^3, j^2) \in \mathcal{N}_e^{3,2}(i)$,
with $\mathcal{N}_{e_{i,j}}^3(i^3) = \{j^3\}$ and $\mathcal{N}_{e_{i,j}}^2(i^2) = \{j^2\}$. Of course, carrying this argument too far leads to explicit
representations, where $K = 1$ or where every state-to-state transition corresponds to a different event.
However, this did not happen in the numerous asynchronous systems we considered in our studies.
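
A minimal Python sketch may help to make the product form and the worst-case subevent construction concrete. The representation is an assumption of this example: each local function is a dictionary from a local state to the set of its successors, `None` stands for a level the event does not depend on, and the names `fire` and `split_into_subevents` are hypothetical.

```python
from itertools import product


def fire(local_fns, state):
    """Apply a product-form next-state function to a (sub)state.

    local_fns[n] is the local function for the n-th component of `state`.
    """
    choices = []
    for fn, i in zip(local_fns, state):
        successors = {i} if fn is None else fn.get(i, set())
        if not successors:
            return set()          # locally disabled, hence disabled
        choices.append(sorted(successors))
    return set(product(*choices))  # cross-product of the local results


def split_into_subevents(joint):
    """Worst-case refinement of a non-product function over two levels:
    one subevent per transition (i, j), each trivially in product form."""
    subevents = []
    for (i3, i2), targets in joint.items():
        for (j3, j2) in targets:
            subevents.append([{i3: {j3}}, {i2: {j2}}])
    return subevents


# A joint function over levels 3 and 2 that is not a product: it swaps
# (0, 0) and (1, 1) but is disabled in (0, 1) and (1, 0).
JOINT = {(0, 0): {(1, 1)}, (1, 1): {(0, 0)}}
SUBEVENTS = split_into_subevents(JOINT)


def fire_joint(state):
    result = set()
    for sub in SUBEVENTS:
        result |= fire(sub, state)
    return result
```

Here `fire([None, {0: {1}}, {1: {0, 2}}], (5, 0, 1))` yields the two states `(5, 1, 0)` and `(5, 1, 2)`, while the union of the subevents reproduces `JOINT` on every substate.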

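Finally, we introduce some notational conventions. We say that event $e$ depends on level $k$ if the local
state at level $k$ affects the enabling of $e$ or if it is changed by the firing of $e$. Let $First(e)$ and $Last(e)$ be
the first and last levels on which event $e$ depends. Events $e$ such that $First(e) = Last(e) = k$ are said to be
local events and can be merged into a single macro-event $\lambda^k$ without violating the product-form requirement,
since one can write $\mathcal{N}_{\lambda^k} = \mathcal{N}_{\lambda^k}^K \times \cdots \times \mathcal{N}_{\lambda^k}^1$ where $\mathcal{N}_{\lambda^k}^k = \bigcup_{\{e : First(e) = Last(e) = k\}} \mathcal{N}_e^k$, while $\mathcal{N}_{\lambda^k}^l(i^l) = \{i^l\}$ for
$l \ne k$ and $i^l \in \mathcal{S}^l$. The set $\{e \in \mathcal{E} : First(e) = k\}$ of events “starting” at level $k$ is denoted by $\mathcal{E}^k$. We also
extend $\mathcal{N}_e$ to substates instead of full states: $\mathcal{N}_e((i^k, \ldots, i^l)) = \mathcal{N}_e^k(i^k) \times \cdots \times \mathcal{N}_e^l(i^l)$, for $K \ge k \ge l \ge 1$; to
sets of states: $\mathcal{N}_e(\mathcal{X}) = \bigcup_{i \in \mathcal{X}} \mathcal{N}_e(i)$, for $\mathcal{X} \subseteq \mathcal{S}^k \times \cdots \times \mathcal{S}^l$; and to sets of events: $\mathcal{N}_{\mathcal{F}}(\mathcal{X}) = \bigcup_{e \in \mathcal{F}} \mathcal{N}_e(\mathcal{X})$,
for $\mathcal{F} \subseteq \mathcal{E}$. In particular, we write $\mathcal{N}_{\le k}$ for $\mathcal{N}_{\{e : First(e) \le k\}}$.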
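
The conventions above can be sketched in a few lines of Python. The encoding is hypothetical: an event is a dictionary that maps each level it depends on to a local function (itself a dictionary from a local state to a set of successors). The sketch also assumes that $First(e)$ is the top-most and $Last(e)$ the bottom-most level of $e$, which is how the algorithm in Section 3 appears to use them.

```python
def first(event):
    return max(event)      # assumed: top-most level the event depends on


def last(event):
    return min(event)      # assumed: bottom-most level the event depends on


def merge_local_events(events):
    """Merge all local events of level k into one macro-event ("lambda", k)."""
    merged = {}
    for name, ev in events.items():
        if first(ev) != last(ev):
            merged[name] = ev                 # non-local events are kept
            continue
        k = first(ev)
        macro = merged.setdefault(("lambda", k), {k: {}})
        for i, targets in ev[k].items():      # union of the local functions
            macro[k].setdefault(i, set()).update(targets)
    return merged


def events_by_first(events):
    """Group event names into the sets E^k, keyed by First(e) = k."""
    by_level = {}
    for name, ev in events.items():
        by_level.setdefault(first(ev), []).append(name)
    return by_level


# Hypothetical model: "a" and "b" are local to level 1, "c" spans 2 and 1.
EXAMPLE = {
    "a": {1: {0: {1}}},
    "b": {1: {0: {2}, 1: {2}}},
    "c": {2: {0: {1}}, 1: {2: {0}}},
}
MERGED = merge_local_events(EXAMPLE)
```

With this input, `events_by_first(MERGED)` places the macro-event of level 1 in $\mathcal{E}^1$ and the event `"c"` in $\mathcal{E}^2$.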

3. A Novel Algorithm Employing Node Saturation. Recall that we describe the behavior of an 
event-based asynchronous system using a product-form next-state function for each event. The system’s 
state space may then be built by iterating these functions in any order, as long as each is considered often 
enough [21], i.e., until no additional reachable states are found. We refer to a specific order of iteration as an
iteration strategy. Clearly, the choice of strategy influences the efficiency of state-space generation. In our
previous work [11] we employed a naive strategy that cycled through MDDs level-by-level and fired, at each
level $k$, all events $e$ with $First(e) = k$.

As the main contribution of this paper, we present a novel iteration strategy, called saturation, which not
only simplifies our previous algorithm, but also significantly improves its time and space efficiency. The key
idea is to fire events node-wise and exhaustively, instead of level-wise and just once per iteration. Formally,
we say that an MDD node $\langle k.p \rangle$ is saturated if it encodes a set of states that is a fixed point with respect to
the firing of any event at its level or at a lower level, i.e., if $\mathcal{B}(\langle k.p \rangle) = \mathcal{N}^*_{\le k}(\mathcal{B}(\langle k.p \rangle))$ holds; it can easily be
shown by contradiction that any node below node $\langle k.p \rangle$ must be saturated, too. It should be noted that the
routine for firing some event, in order to reveal and add globally reachable states to the MDD-representation 
of the state space under construction, is similar to [11]. In particular, MDDs are only locally manipulated 
with respect to the levels on which the fired event depends, and, due to the product-form behavior, these 
manipulations can be carried out very efficiently. We do not further comment on these issues here, but 
concentrate solely on the new idea of node saturation and its implications. 

Just as in traditional symbolic state-space generation algorithms, we use a unique table, to detect duplicate
nodes, and operation caches, in particular a union cache and a firing cache, to speed up computation.
However, our approach is distinguished by the fact that only saturated nodes are checked in the unique
table or referenced in the caches. Given the MDD encoding of the initial state $s$, we saturate its nodes
bottom-up. This improves both memory and execution-time efficiency for generating state spaces for
the following reasons. First, our saturation order ensures that the firing of an event affecting only the
current and possibly lower levels adds as many new states as possible. Then, since each node in the final
encoding of $\mathcal{S}$ is saturated, any node we insert in the unique table has at least a chance of still being part
of the final MDD, while any unsaturated node inserted by a traditional symbolic approach is guaranteed to
be eventually deleted and replaced with another node encoding a larger subset of states. Finally, once we
saturate a node at level $k$, we never need to fire any event $e \in \mathcal{E}^k$ in it again, while, in classic symbolic
approaches, $\mathcal{N}$ is applied to the entire MDD at every iteration.

In the pseudo-code of our new algorithm implementing node saturation, which is shown in Figure 3.1,
we use the data types evnt (model event), lcl (local state), lvl (level), and idx (node index within a level);
in practice these are simply integers in appropriate ranges. We also assume the following dynamically-sized
global hash tables: (a) UT[k], for $K \ge k \ge 1$, the unique table for nodes at level $k$, to retrieve $p$ given the key
$\langle k.p \rangle[0], \ldots, \langle k.p \rangle[n^k - 1]$; (b) UC[k], for $K \ge k \ge 1$, the union cache for nodes at level $k$, to retrieve $s$ given
nodes $p$ and $q$, where $\mathcal{B}(\langle k.s \rangle) = \mathcal{B}(\langle k.p \rangle) \cup \mathcal{B}(\langle k.q \rangle)$; and (c) FC[k], for $K \ge k \ge 1$, the firing cache for nodes
at level $k$, to retrieve $s$ given node $p$ and event $e$, where $First(e) > k$ and $\mathcal{B}(\langle k.s \rangle) = \mathcal{N}^*_{\le k}(\mathcal{N}_e(\mathcal{B}(\langle k.p \rangle)))$.
Furthermore, we use $K$ dynamically-sized arrays to store nodes, so that $\langle k.p \rangle$ can be efficiently retrieved as
the $p$-th entry of the $k$-th array. The call Generate(s) creates the MDD encoding the initial state, saturating
each MDD node as soon as it creates it, in a bottom-up fashion. Hence, when it calls Saturate(k, r), all
children of $\langle k.r \rangle$ are already saturated. Thus, our focus for the algorithm’s correctness is on the correctness
of Saturate and the routine RecFire invoked by it.

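Theorem 3.1 (Correctness). Consider a node $\langle k.p \rangle$ with $K \ge k \ge 1$ and saturated children. Moreover,
(a) let $\langle l.q \rangle$ be one of its children, satisfying $q \ne \mathbf{0}$ and $l = k - 1$; (b) let $\mathcal{U}$ stand for $\mathcal{B}(\langle l.q \rangle)$ before the
call RecFire(e, l, q), for some event $e$ with $l < First(e)$, and let $\mathcal{V}$ represent $\mathcal{B}(\langle l.f \rangle)$, where $f$ is the value
returned by this call; and (c) let $\mathcal{X}$ and $\mathcal{Y}$ denote $\mathcal{B}(\langle k.p \rangle)$ before and after calling Saturate(k, p), respectively.
Then, (i) $\mathcal{V} = \mathcal{N}^*_{\le l}(\mathcal{N}_e(\mathcal{U}))$ and (ii) $\mathcal{Y} = \mathcal{N}^*_{\le k}(\mathcal{X})$.

By choosing, for node $\langle k.p \rangle$, the root $\langle K.r \rangle$ of the MDD representing the initial system state $s$, we obtain
$\mathcal{Y} = \mathcal{N}^*_{\le K}(\mathcal{B}(\langle K.r \rangle)) = \mathcal{N}^*_{\le K}(\{s\}) = \mathcal{S}$, as desired.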

Proof. To prove both statements we employ a simultaneous induction on $k$. For the induction base, $k = 1$,
we have: (i) The only possible call RecFire(e, 0, 1) immediately returns 1 because of the test on $l$ (cf. line 1).
Then, $\mathcal{U} = \mathcal{V} = \{()\}$ and $\{()\} = \mathcal{N}^*_{\le 0}(\mathcal{N}_e(\{()\}))$. (ii) The call Saturate(1, p) repeatedly explores $\lambda^1$, the only
event in $\mathcal{E}^1$, in every local state $i$ for which $\mathcal{N}^1_{\lambda^1}(i) \ne \emptyset$ and for which $\langle 1.p \rangle[i]$ is either 1 at the beginning of
the “while $\mathcal{L} \ne \emptyset$” loop, or has been modified (cf. line 12) from 0 to 1, which is the value of $f$, hence $u$, since
the call RecFire(e, 0, 1) returns 1. The iteration stops when further attempts to fire $\lambda^1$ do not add any new
state to $\mathcal{B}(\langle 1.p \rangle)$. At this point, $\mathcal{Y} = \mathcal{N}^*_{\lambda^1}(\mathcal{X}) = \mathcal{N}^*_{\le 1}(\mathcal{X})$.

For the induction step we assume that the calls to Saturate(k−1, ·) as well as to RecFire(e, l−1, ·) work
correctly. Recall that $l = k - 1$.

(i) Unlike Saturate (cf. line 14), RecFire does not add further local states to $\mathcal{L}$, since it modifies “in-
place” the new node $\langle l.s \rangle$, and not the node $\langle l.q \rangle$ describing the states from where the firing is
explored. The call RecFire(e, l, q) can be resolved in three ways. If $l < Last(e)$, then the returned

Generate(in s: array[1..K] of lcl): idx
    Build an MDD rooted at ⟨K.r⟩ encoding {s} and return r, in UT[K].
    declare r, p: idx;
    declare k: lvl;
     1. p ⇐ 1;
     2. for k = 1 to K do
     3.     r ⇐ NewNode(k); ⟨k.r⟩[s[k]] ⇐ p;
     4.     Saturate(k, r); Check(k, r);
     5.     p ⇐ r;
     6. return r;

Saturate(in k: lvl, p: idx)
    Update ⟨k.p⟩, not in UT[k], in-place, to encode N*_{≤k}(B(⟨k.p⟩)).
    declare e: evnt;
    declare L: set of lcl;
    declare f, u: idx;
    declare i, j: lcl;
    declare pCng: bool;
     1. repeat
     2.     pCng ⇐ false;
     3.     foreach e ∈ E^k do
     4.         L ⇐ Locals(e, k, p);
     5.         while L ≠ ∅ do
     6.             i ⇐ Pick(L);
     7.             f ⇐ RecFire(e, k−1, ⟨k.p⟩[i]);
     8.             if f ≠ 0 then
     9.                 foreach j ∈ N_e^k(i) do
    10.                     u ⇐ Union(k−1, f, ⟨k.p⟩[j]);
    11.                     if u ≠ ⟨k.p⟩[j] then
    12.                         ⟨k.p⟩[j] ⇐ u; pCng ⇐ true;
    13.                         if N_e^k(j) ≠ ∅ then
    14.                             L ⇐ L ∪ {j};
    15. until pCng = false;

Union(in k: lvl, p: idx, q: idx): idx
    Build an MDD rooted at ⟨k.s⟩, in UT[k], encoding B(⟨k.p⟩) ∪ B(⟨k.q⟩). Return s.
    declare i: lcl;
    declare s, u: idx;
     1. if p = 1 or q = 1 then return 1;
     2. if p = 0 or p = q then return q;
     3. if q = 0 then return p;
     4. if Find(UC[k], {p, q}, s) then return s;
     5. s ⇐ NewNode(k);
     6. for i = 0 to n^k − 1 do
     7.     u ⇐ Union(k−1, ⟨k.p⟩[i], ⟨k.q⟩[i]);
     8.     ⟨k.s⟩[i] ⇐ u;
     9. Check(k, s); Insert(UC[k], {p, q}, s);
    10. return s;

RecFire(in e: evnt, l: lvl, q: idx): idx
    Build an MDD rooted at ⟨l.s⟩, in UT[l], encoding N*_{≤l}(N_e(B(⟨l.q⟩))). Return s.
    declare L: set of lcl;
    declare f, u, s: idx;
    declare i, j: lcl;
    declare sCng: bool;
     1. if l < Last(e) then return q;
     2. if Find(FC[l], {q, e}, s) then return s;
     3. s ⇐ NewNode(l); sCng ⇐ false;
     4. L ⇐ Locals(e, l, q);
     5. while L ≠ ∅ do
     6.     i ⇐ Pick(L);
     7.     f ⇐ RecFire(e, l−1, ⟨l.q⟩[i]);
     8.     if f ≠ 0 then
     9.         foreach j ∈ N_e^l(i) do
    10.             u ⇐ Union(l−1, f, ⟨l.s⟩[j]);
    11.             if u ≠ ⟨l.s⟩[j] then
    12.                 ⟨l.s⟩[j] ⇐ u; sCng ⇐ true;
    13. if sCng then Saturate(l, s);
    14. Check(l, s); Insert(FC[l], {q, e}, s);
    15. return s;

Find(in tab, key, out v): bool
    If (key, x) is in hash table tab, set v to x and return true. Else, return false.

Insert(inout tab, in key, v)
    Insert (key, v) in hash table tab, if it does not contain an entry (key, ·).

Locals(in e: evnt, k: lvl, p: idx): set of lcl
    Return {i ∈ S^k : ⟨k.p⟩[i] ≠ 0, N_e^k(i) ≠ ∅}, the local states in p locally enabling e.
    Return ∅ or {i ∈ S^k : N_e^k(i) ≠ ∅}, respectively, if p is 0 or 1.

Pick(inout L: set of lcl): lcl
    Remove and return an element from L.

NewNode(in k: lvl): idx
    Create ⟨k.p⟩ with arcs set to 0, return p.

Check(in k: lvl, inout p: idx)
    If ⟨k.p⟩, not in UT[k], duplicates ⟨k.q⟩, in UT[k], delete ⟨k.p⟩ and set p to q.
    Else, insert ⟨k.p⟩ in UT[k]. If ⟨k.p⟩[0] = ··· = ⟨k.p⟩[n^k − 1] = 0 or 1, delete ⟨k.p⟩
    and set p to 0 or 1, since B(⟨k.p⟩) is ∅ or S^k × ··· × S^1, respectively.

Fig. 3.1. Pseudo-code for the node-saturation algorithm.

value is $f = q$ and $\mathcal{N}_e(\mathcal{U}) = \mathcal{U}$ for any set $\mathcal{U}$; since $q$ is saturated, $\mathcal{B}(\langle l.q \rangle) = \mathcal{N}^*_{\le l}(\mathcal{B}(\langle l.q \rangle)) =
\mathcal{N}^*_{\le l}(\mathcal{N}_e(\mathcal{B}(\langle l.q \rangle)))$. If $l \ge Last(e)$ but RecFire has been called previously with the same
parameters, then the call Find(FC[l], {q, e}, s) is successful. Since node $q$ is saturated and in the
unique table, it has not been modified further; note that in-place updates are performed only
on nodes not yet in the unique table. Thus, the value $s$ in the cache is still valid and can be
safely used. Finally, we need to consider the case where the call RecFire(e, l, q) performs “real
work.” First, a new node $\langle l.s \rangle$ is created, having all its arcs initialized to 0. We explore the firing
of $e$ in each state $i$ satisfying $\langle l.q \rangle[i] \ne \mathbf{0}$ and $\mathcal{N}^l_e(i) \ne \emptyset$. By induction hypothesis, the recursive
call RecFire(e, l−1, ⟨l.q⟩[i]) returns $\mathcal{N}^*_{\le l-1}(\mathcal{N}_e(\mathcal{B}(\langle l{-}1.\langle l.q \rangle[i] \rangle)))$. Hence, when the “while $\mathcal{L} \ne \emptyset$”
loop terminates, $\mathcal{B}(\langle l.s \rangle) = \bigcup_{i \in \mathcal{S}^l} \mathcal{N}^l_e(i) \times \mathcal{N}^*_{\le l-1}(\mathcal{N}_e(\mathcal{B}(\langle l{-}1.\langle l.q \rangle[i] \rangle))) = \mathcal{N}^*_{\le l-1}(\mathcal{N}_e(\mathcal{B}(\langle l.q \rangle)))$ holds.
Thus, all children of node $\langle l.s \rangle$ are saturated. According to the induction hypothesis, the call
Saturate(l, s) correctly saturates $\langle l.s \rangle$. Consequently, we have $\mathcal{B}(\langle l.s \rangle) = \mathcal{N}^*_{\le l}(\mathcal{N}^*_{\le l-1}(\mathcal{N}_e(\mathcal{B}(\langle l.q \rangle)))) =
\mathcal{N}^*_{\le l}(\mathcal{N}_e(\mathcal{B}(\langle l.q \rangle)))$ after the call.

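(ii) As in the base case, Saturate(k, p) repeatedly explores the firing of each event $e$ that is locally
enabled in $i \in \mathcal{S}^k$, by calling RecFire(e, k−1, ⟨k.p⟩[i]) which, as shown above and since $l = k - 1$,
returns $\mathcal{N}^*_{\le k-1}(\mathcal{N}_e(\mathcal{B}(\langle k{-}1.\langle k.p \rangle[i] \rangle)))$. Further, Saturate(k, p) terminates when firing the events
in $\mathcal{E}^k = \{e_1, e_2, \ldots, e_m\}$ does not add any new state to $\mathcal{B}(\langle k.p \rangle)$. At this point, the set $\mathcal{Y}$ encoded
by $\langle k.p \rangle$ is the fixed-point of the iteration

$$
\mathcal{Y}^{(m+1)} \Leftarrow \mathcal{Y}^{(m)} \cup \mathcal{N}^*_{\le k-1}(\mathcal{N}_{e_1}(\mathcal{N}^*_{\le k-1}(\mathcal{N}_{e_2}(\cdots \mathcal{N}^*_{\le k-1}(\mathcal{N}_{e_m}(\mathcal{Y}^{(m)})) \cdots)))),
$$

initialized with $\mathcal{Y}^{(0)} \Leftarrow \mathcal{X}$ [21]. Hence, $\mathcal{Y} = \mathcal{N}^*_{\le k}(\mathcal{X})$, as desired.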

This completes the correctness proof of the algorithm. □ 
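
For readers who want to experiment, the routines of Figure 3.1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the SMART implementation: the class and method names are hypothetical, events are encoded as dictionaries from a level to a local function (missing levels act as the identity), $First(e)$ and $Last(e)$ are taken to be the highest and lowest level of $e$, the shared inner loop of Saturate and RecFire is factored into one helper (`fire_all`), and nodes are never deleted or recycled.

```python
class Saturation:
    def __init__(self, sizes, events):
        # sizes[k] = n^k for k = 1..K (sizes[0] is padding); events[e] maps a
        # level to {i: set of next local states}; missing levels are identity.
        self.n, self.K, self.ev = sizes, len(sizes) - 1, events
        self.nodes = [[None, None] for _ in sizes]  # indexes 0 and 1 reserved
        self.ut = [{} for _ in sizes]               # unique tables UT[k]
        self.uc, self.fc = {}, {}                   # union and firing caches
        self.by_first = {}                          # E^k, keyed by First(e)
        for e, fns in events.items():
            self.by_first.setdefault(max(fns), []).append(e)

    def arcs(self, k, p):
        return [p] * self.n[k] if p < 2 else self.nodes[k][p]

    def nxt(self, e, k, i):
        fn = self.ev[e].get(k)
        return {i} if fn is None else fn.get(i, set())

    def new_node(self, k):
        self.nodes[k].append([0] * self.n[k])
        return len(self.nodes[k]) - 1

    def check(self, k, p):
        arcs = tuple(self.nodes[k][p])
        if len(set(arcs)) == 1 and arcs[0] < 2:
            return arcs[0]                      # reserved node 0 or 1
        return self.ut[k].setdefault(arcs, p)   # duplicates become garbage

    def union(self, k, p, q):
        if 1 in (p, q):
            return 1
        if p == q or 0 in (p, q):
            return max(p, q)
        key = (k, frozenset((p, q)))
        if key not in self.uc:
            s, a, b = self.new_node(k), self.arcs(k, p), self.arcs(k, q)
            for i in range(self.n[k]):
                self.nodes[k][s][i] = self.union(k - 1, a[i], b[i])
            self.uc[key] = self.check(k, s)
        return self.uc[key]

    def fire_all(self, e, k, src, dst, requeue):
        # Fire e in every enabled local state of arcs `src`, adding to `dst`.
        todo = {i for i in range(self.n[k]) if src[i] and self.nxt(e, k, i)}
        changed = False
        while todo:
            i = todo.pop()
            f = self.rec_fire(e, k - 1, src[i])
            for j in (self.nxt(e, k, i) if f else ()):
                u = self.union(k - 1, f, dst[j])
                if u != dst[j]:
                    dst[j], changed = u, True
                    if requeue and self.nxt(e, k, j):
                        todo.add(j)
        return changed

    def rec_fire(self, e, l, q):
        key = (l, q, e)
        if l < min(self.ev[e]) or key in self.fc:
            return self.fc.get(key, q)          # below Last(e), or cache hit
        s = self.new_node(l)
        if self.fire_all(e, l, self.arcs(l, q), self.nodes[l][s], False):
            self.saturate(l, s)
        self.fc[key] = self.check(l, s)
        return self.fc[key]

    def saturate(self, k, p):
        node, changed = self.nodes[k][p], True
        while changed:
            changed = False
            for e in self.by_first.get(k, []):
                changed |= self.fire_all(e, k, node, node, True)

    def generate(self, s):
        p = 1                                   # s[k]: initial local state
        for k in range(1, self.K + 1):
            r = self.new_node(k)
            self.nodes[k][r][s[k]] = p
            self.saturate(k, r)
            p = self.check(k, r)
        return p

    def states(self, k, p):
        if k == 0:
            return {()} if p == 1 else set()
        return {(i,) + t for i, q in enumerate(self.arcs(k, p))
                for t in self.states(k - 1, q)}


# Hypothetical three-level model; states are written (i^3, i^2, i^1).
EVENTS = {
    "t1": {1: {0: {1}}},
    "t2": {2: {0: {1}}, 1: {1: {0}}},
    "t3": {3: {0: {1}}, 1: {1: {1}}},
    "t4": {2: {1: {2}}},
}
gen = Saturation([0, 2, 3, 2], EVENTS)
root = gen.generate([0, 0, 0, 0])
REACHED = gen.states(gen.K, root)
```

For this toy model the sketch finds 11 of the 12 potential states; state (1, 0, 0) is unreachable. Leaving replaced duplicates in the node arrays keeps the sketch short at the price of wasted memory.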

Figure 3.2 illustrates our saturation-based state-space generation algorithm on a small example, where
$K = 3$, $|\mathcal{S}^3| = 2$, $|\mathcal{S}^2| = 3$, and $|\mathcal{S}^1| = 3$. The initial state is (0,0,0), and there are three local events $l_1$, $l_2$,
and $l_3$, plus two further events, $e_{21}$ (depending on levels 2 and 1) and $e_{321}$ (depending on all levels). Their
effects, i.e., their next-state functions, are summarized in the table at the top of Figure 3.2; the symbol “∗”
indicates that a level does not affect an event. The MDD encoding {(0,0,0)} is displayed in Snapshot (a).
Nodes $\langle 3.2 \rangle$ and $\langle 2.2 \rangle$ are actually created in Steps (b) and (g), respectively, but we show them from the
beginning for clarity. The level lvl of a node $\langle lvl.idx \rangle$ is given at the very left of the MDD figures, whereas
the index idx is shown to the right of each node. We use dashed lines for newly created objects, double
boxes for saturated nodes, and shaded local states for substates enabling the event to be fired. We do not
show nodes with index 0 nor any arcs to them.

• Snapshots (a-b): The call Saturate(1,2) updates node $\langle 1.2 \rangle$ to represent the effect of firing $l_1$; the
result is equal to the reserved node $\langle 1.1 \rangle$.

• Snapshots (b-f): The call Saturate(2,2) fires event $l_2$, adding arc $\langle 2.2 \rangle[1]$ to $\langle 1.1 \rangle$ (cf. Snapshot (c)).
It also fires event $e_{21}$ which finds the “enabling pattern” (∗,0,1), with arbitrary first component,
and starts building the result of the firing, through the sequence of calls RecFire($e_{21}$, 1, ⟨2.2⟩[0]) and
RecFire($e_{21}$, 0, ⟨1.1⟩[1]). Once node $\langle 1.3 \rangle$ is created and its arc $\langle 1.3 \rangle[0]$ is set to 1 (cf. Snapshot (d)),
it is saturated by repeatedly firing event $l_1$. Node $\langle 1.3 \rangle$ then becomes identical to node $\langle 1.1 \rangle$ (cf.
Snapshot (e)). Hence, it is not added to the unique table but deleted. Returning from RecFire on
level 1 with result $\langle 1.1 \rangle$, arc $\langle 2.2 \rangle[1]$ is updated to point to the outcome of the firing (cf. Snapshot (f)).
This does not add any new state to the MDD, since the state set $\mathcal{S}^3 \times \{1\} \times \{0\}$ was already encoded
in $\mathcal{B}(\langle 2.2 \rangle)$.

Fig. 3.2. Example of the execution of the Saturate and RecFire routines.


• Snapshots (f-o): Once $\langle 2.2 \rangle$ is saturated, we call Saturate(3,2). Local event $l_3$ is not enabled, but
event $e_{321}$ is, by the pattern (0,0,0). The calls to RecFire build a chain of nodes encoding the
result of the firing (cf. Snapshots (g-i)). Each of them is in turn saturated (cf. Snapshots (h-j)),
causing first the newly created node $\langle 1.4 \rangle$ to be deleted, since it becomes equal to node $\langle 1.1 \rangle$, and
second the saturated node $\langle 2.3 \rangle$ to be added to the MDD. The firing of $e_{321}$ (cf. Snapshot (k)) not
only adds state (1,2,1), but the entire subspace $\{1\} \times \{1,2\} \times \mathcal{S}^1$, now known to be exhaustively
explored, as node $\langle 2.3 \rangle$ is marked saturated. Event $l_3$, which was found disabled in node $\langle 3.2 \rangle$ at
the first attempt, is now enabled, and its firing calls Union(2, ⟨3.2⟩[1], ⟨3.2⟩[0]). The result is a new
node which is found by Check to be the reserved node $\langle 2.1 \rangle$ (cf. Snapshot (m)). This node encoding
$\mathcal{S}^2 \times \mathcal{S}^1$ is added as the descendant of node $\langle 3.2 \rangle$ in position 0, and the former descendant $\langle 2.2 \rangle$ in
that position is removed (cf. Snapshot (n)), causing it to become disconnected and deleted. Further
attempts to fire events $l_3$ or $e_{321}$ add no more states to the MDD, whence node $\langle 3.2 \rangle$ is declared
saturated (cf. Snapshot (o)). Thus, our algorithm terminates and returns the overall state space
$(\{0\} \times \mathcal{S}^2 \times \mathcal{S}^1) \cup (\{1\} \times \{1,2\} \times \mathcal{S}^1)$.

To summarize, since MDD nodes are saturated as soon as they are created, each node will either be present in
the final diagram or will eventually become disconnected, but never be modified. This reduces the amount
of work needed to explore subspaces. Once all events in $\mathcal{E}^k$ are exhaustively fired in some node $\langle k.p \rangle$,
any further state discovered that uses $\langle k.p \rangle$ for its encoding benefits in advance from the “knowledge”
encapsulated in $\langle k.p \rangle$ and its descendants.

4. Garbage Collection, Optimizations, and Generalizations. Before evaluating our saturation
algorithm by means of experimental studies, we briefly discuss some implementation details regarding
garbage-collection policies, mention two optimizations noticeably affecting the algorithm’s performance, and remark
on extending the algorithm to deal with multiple initial system states.

4.1. Garbage Collection. MDD nodes can become disconnected, i.e., unreachable from the root, and
should be “recycled.” Disconnection is detected by associating an incoming-arc counter to each node $\langle k.p \rangle$
such that $\langle k.p \rangle$ is disconnected if and only if its counter is zero. Recycling disconnected nodes is a major issue
in traditional symbolic state-space generation algorithms, where usually many nodes become disconnected.
In our algorithm, this phenomenon is much less frequent, and the best runtime is achieved by removing these
nodes only at the end; we refer to this policy as the Lazy policy.

We also implemented a Strict policy where, if a node $\langle k.p \rangle$ becomes disconnected, its “delete-flag” is
set and its arcs $\langle k.p \rangle[i]$ are re-directed to $\langle k{-}1.\mathbf{0} \rangle$, with possible recursive effects on the nodes downstream.
When a hit in the union cache UC[k] or the firing cache FC[k] returns $s$, we consider this entry stale if
the delete-flag of node $\langle k.s \rangle$ is set. By keeping a per-level count of the nodes with delete-flag set, we can
decide in routine NewNode(k) whether (a) to allocate new memory for a node at level $k$ or (b) to recycle
the indexes and the physical memory of all nodes at level $k$ with delete-flag set, after having removed all
the entries in UC[k] and FC[k] referring to them. The threshold that triggers recycling can be set in terms
of numbers of nodes or bytes of memory. The policy using a threshold of one node, denoted as Strict(1),
is optimal in terms of memory consumption, but has a higher overhead due to frequent clean-ups.
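
The bookkeeping behind the two policies can be sketched as follows. This is an illustrative Python fragment, not the SMART data structure: the class `NodeStore` and its methods are hypothetical, nodes are keyed by the pair (level, index), and the actual recycling of memory and the purging of cache entries are left out.

```python
class NodeStore:
    def __init__(self):
        self.arcs = {}         # (k, p) -> list of child indexes at level k-1
        self.incoming = {}     # (k, p) -> incoming-arc counter
        self.deleted = set()   # nodes whose delete-flag is set

    def add(self, k, p, children):
        self.arcs[(k, p)] = list(children)
        self.incoming[(k, p)] = 0
        for q in children:
            self._link(k - 1, q)

    def _link(self, k, q):
        if q > 1:                          # indexes 0 and 1 are reserved
            self.incoming[(k, q)] += 1

    def _unlink(self, k, q, strict):
        if q < 2:
            return
        self.incoming[(k, q)] -= 1
        if strict and self.incoming[(k, q)] == 0:
            # Strict policy: flag the node and redirect its arcs to index 0,
            # which may disconnect nodes further downstream.
            self.deleted.add((k, q))
            children = self.arcs[(k, q)]
            self.arcs[(k, q)] = [0] * len(children)
            for child in children:
                self._unlink(k - 1, child, strict)

    def redirect(self, k, p, i, q, strict=True):
        """Make arc i of <k.p> point to q, updating the counters."""
        old = self.arcs[(k, p)][i]
        self.arcs[(k, p)][i] = q
        self._link(k - 1, q)
        self._unlink(k - 1, old, strict)

    def is_stale(self, k, s):
        """A cache hit returning s is stale if <k.s> carries the delete-flag."""
        return (k, s) in self.deleted


store = NodeStore()
store.add(1, 2, [1, 0])
store.add(2, 2, [2, 2])
store.add(2, 3, [2, 0])
store.add(3, 2, [2, 3])        # root with children <2.2> and <2.3>
store.redirect(3, 2, 0, 3)     # <2.2> loses its only incoming arc
```

With `strict=False` the counters are still maintained but no node is flagged, which corresponds to deferring all clean-up to the end as in the Lazy policy.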

4.2. Optimizations. In our implementation we employ several optimizations. For example, the two
outermost loops in Saturate ensure that firing any event $e \in \mathcal{E}^k$ adds no new states. However, if we always
consider these events in the same order, we can stop iterating as soon as $|\mathcal{E}^k|$ consecutive events have been
explored without revealing any new state. This saves $|\mathcal{E}^k|/2$ firing attempts on average, which translates to
speed-ups of up to 25% in our experimental studies. Also, in Union, the call Insert(UC[k], {p, q}, s) records
that $\mathcal{B}(\langle k.s \rangle) = \mathcal{B}(\langle k.p \rangle) \cup \mathcal{B}(\langle k.q \rangle)$. Since this implies $\mathcal{B}(\langle k.s \rangle) = \mathcal{B}(\langle k.p \rangle) \cup \mathcal{B}(\langle k.s \rangle)$ and $\mathcal{B}(\langle k.s \rangle) = \mathcal{B}(\langle k.s \rangle) \cup
\mathcal{B}(\langle k.q \rangle)$, we can, optionally, also issue the calls Insert(UC[k], {p, s}, s), if $s \ne p$, and Insert(UC[k], {q, s}, s),
if $s \ne q$. This speculative union heuristic improves performance by up to 20%.
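
Both optimizations are easy to prototype. The Python sketch below is only an illustration: the function names are hypothetical, `fire` is an assumed callback that reports whether firing an event added states, and frozensets of states stand in for MDD node indexes in the union cache.

```python
def saturate_round_robin(events, fire):
    """Fire events cyclically in a fixed order and stop as soon as
    len(events) consecutive firings reveal nothing new.
    Returns the number of firing attempts."""
    idle = pos = attempts = 0
    while idle < len(events):
        attempts += 1
        idle = 0 if fire(events[pos]) else idle + 1
        pos = (pos + 1) % len(events)
    return attempts


def cached_union(cache, p, q, speculative=True):
    """Union with a cache keyed by the unordered pair {p, q}."""
    key = frozenset((p, q))
    if key not in cache:
        s = p | q
        cache[key] = s
        if speculative:
            # s = p U q implies s = p U s and s = s U q.
            if s != p:
                cache.setdefault(frozenset((p, s)), s)
            if s != q:
                cache.setdefault(frozenset((q, s)), s)
    return cache[key]


# Scripted outcomes: only the first two firings add new states.
outcomes = iter([True, True, False, False, False])
ATTEMPTS = saturate_round_robin(["a", "b", "c"], lambda e: next(outcomes))

CACHE = {}
P, Q = frozenset({1, 2}), frozenset({2, 3})
S = cached_union(CACHE, P, Q)
```

In the scripted run the loop stops after five attempts, whereas a repeat-until loop over full passes would need six.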

4.3. Generalizations. So far we only discussed state-space generation starting from an MDD encoding 
a single initial system state. We also implemented an extended version of our algorithm that can compute 
$\mathcal{N}^*(\mathcal{B}(\langle K.r \rangle))$ for any arbitrary MDD rooted at node $\langle K.r \rangle$. This is of importance for adapting our ideas
to model checking [16]. The necessary technical details underlying this issue are quite straightforward and, 
thus, are omitted here. 

5. Experimental Results. In this section we compare the performance of our new algorithm, using 
both the Strict and Lazy policies, with previous MDD-based ones, namely the traditional Recursive
MDD approach in [30] and the level-by-level Forwarding-arcs approach in [11]. All three approaches
are implemented in SMART [12], a tool for the logical and stochastic-timing analysis of discrete-state 
systems. For asynchronous systems, these approaches greatly outperform the more traditional BDD-based 
approaches [28], where next-state functions are encoded using decision diagrams. To evaluate our saturation 
algorithm, we have chosen a suite of examples with a wide range of characteristics. In all cases, the state 
space sizes depend on a parameter N e N. 

Fig. 5.1. Petri nets used in our experiments: round-robin mutex protocol (upper left), dining philosophers (upper right), 
FMS (lower left), and slotted-ring (lower right). 

• The classic N queens problem requires finding a way to position N queens on an $N \times N$ chess board 
such that they do not attack each other. Since there will be exactly one queen per row in the final 
solution, we use a safe (i.e., at most one token per place) Petri net model with $N \times N$ transitions 
and N rows, one per MDD level, of N + 1 places. For $1 \le i, j \le N$, place $p_{ij}$ is initially empty, and 
place $p_{i0}$ contains the token (queen) still to be placed on row i of the chess board. Transition $t_{ij}$ 
moves the queen from place $p_{i0}$ to place $p_{ij}$, in competition with all other transitions $t_{il}$, for $l \neq j$. 
To encode the mutual exclusion of queens on the same column or diagonal, we employ inhibitor arcs. 
A correct placement of the N queens corresponds to a marking where all places $p_{i0}$ are empty. Note 
that our state space contains all reachable markings, including those where queens n to N still need 
to be placed, for any n. In this model, locality is poor, since $t_{ij}$ depends on levels 1 through i. 

• The dining philosophers and slotted ring models [11, 33] are obtained by connecting N identical safe 
subnets “in a circle.” The MDD has N/2 levels (two subnets per level) for the former model 
and N levels (one subnet per level) for the latter. Events are either local or synchronize adjacent 
subnets; thus, they span only two levels, except for those synchronizing subnet N with subnet 1. 

• The round-robin mutex protocol model [23] also has N identical safe subnets placed in a circular 
fashion, which represent N processes, each mapped to one MDD level. Another subnet models a 
resource shared by the N processes, giving rise to one more level, at the bottom of the MDD. There 
are no local events and, in addition to events synchronizing adjacent subnets, the model contains 
events synchronizing levels n and 1, for $2 \le n \le N + 1$. 

• The flexible manufacturing system (FMS) model [30] has a fixed shape, but is parameterized by the 
initial number N of tokens in some places. We partition this model into 19 subnets, giving rise to a 
19-level MDD with a moderate degree of locality, as events span from two to six levels. 

The Petri nets for these systems, except for the queens problem, are depicted in Figure 5.1. 
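
Since the queens net is not depicted, its behavior can be made concrete by an explicit enumeration of its markings. The sketch below is only an illustration and makes assumptions beyond the description above: each marking is stored as a tuple with one entry per row (0 if the token is still in $p_{i0}$, and j if it is in $p_{ij}$), rows are indexed from 0 in the code, and queens are assumed to be placed row by row from the top, which is our reading of the remark that $t_{ij}$ depends on levels 1 through i. The function names are hypothetical.

```python
from collections import deque
from typing import Iterator, Set, Tuple

Marking = Tuple[int, ...]   # marking[i] == 0: token in p_{i0}; == j: token in p_{ij}


def successors(marking: Marking) -> Iterator[Marking]:
    """Markings reached by firing one enabled transition t_{ij}.

    Assumption: rows are filled top-down, so only the first unplaced row can fire.
    """
    n = len(marking)
    if 0 not in marking:
        return                                   # all queens are placed
    i = marking.index(0)                         # first row whose queen is unplaced
    for j in range(1, n + 1):
        # Inhibitor arcs: t_{ij} is disabled by a queen on the same column or diagonal.
        attacked = any(
            marking[r] == j or abs(marking[r] - j) == i - r
            for r in range(i)
        )
        if not attacked:
            yield marking[:i] + (j,) + marking[i + 1:]


def reachable_markings(n: int) -> Set[Marking]:
    """Breadth-first exploration from the marking with all queens unplaced."""
    initial = (0,) * n
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        current = frontier.popleft()
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


markings = reachable_markings(4)
solutions = sorted(m for m in markings if 0 not in m)
print(len(markings), solutions)
```

For N = 4 this enumeration visits 17 markings, of which only two are complete placements, which illustrates why the state space of this model is dominated by partial placements.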

Figure 5.2 compares three variants of our new algorithm, using the Lazy policy or the Strict policy 
with thresholds of 1 or 100 nodes per level, respectively, against the Recursive algorithm in [30] and the 
Forwarding algorithm in [11]. We ran SMART on an 800 MHz Intel Pentium III PC under Linux. On the 
left of Figure 5.2, we give the size of the state space for each model and the value of N. The graphs in the 
middle and right columns show the peak and final numbers of MDD nodes and the CPU time in seconds 
required for state-space generation, respectively. 

For the models introduced above, our new approach is up to two orders of magnitude faster than [30] 
(a speed-up factor of 384 is obtained for the dining philosophers model with N = 1000), and up to one order of 
magnitude faster than [11] (a speed-up factor of 38 is achieved for the slotted ring model with 50 slots). 
These results are observed for the Lazy variant of the algorithm, which yields the best runtimes; the Strict 
policy also outperforms [30] and [11]. Furthermore, the gap keeps increasing as we scale up the models. Just 
as important, the saturation algorithm tends to use far fewer MDD nodes and hence less memory. This is 
most apparent in the FMS model, where the difference between the peak and the final number of nodes is 
just a constant, 10, for any Strict policy. Also notable is the reduced memory consumption for the slotted 
ring model, where the Strict(1) policy uses 23 times fewer nodes than [30], for N = 50. In terms 
of absolute memory requirements, the number of nodes is essentially proportional to bytes of memory. For 
reference, the largest memory consumption in our experiments was recorded with 9.7MB for the FMS model 
with 100 tokens; auxiliary data structures required up to 2.5MB for encoding the next-state functions and 
200KB for storing the local state spaces, while the caches used less than 1MB. Other SMART structures 
account for another 4MB. 

In a nutshell, with respect to generation time, the best algorithm is Lazy, followed by Strict(100), 
Strict(1), Forwarding, and Recursive. According to memory consumption, the best algorithm is 
Strict(1), followed by Strict(100), Lazy, Forwarding, and Recursive. Thus, our new algorithm 
is consistently faster and uses less memory than previously proposed approaches. The worst model for all 
algorithms is the queens problem, which has a very large number of nodes in the final representation of $\mathcal{S}$ 
and little locality. Even here, however, our algorithm uses slightly fewer nodes and is substantially faster. 
Finally, we observe that, when the Lazy and Strict policies differ widely in terms of memory consumption 
and CPU time, the choice of threshold for the Strict policy lets us trade off time vs. space efficiency. 

Hence, exploiting the locality inherent in asynchronous systems and employing a clever strategy for 
iterating their local next-state functions is the key to efficiency for symbolic state-space generators. 

6. Related Work. We already pointed out the significant differences of our approach to symbolic 
state-space generation when compared to traditional approaches reported in the literature [28], which are 
usually deployed for model checking [14]. Hence, for comparing our algorithm to this work fairly, it needs to 
be extended to a full model checker first, which is currently being investigated. The following sections briefly 
survey some orthogonal and alternative approaches to improving the scalability of state-space generation 
and model-checking techniques. These approaches can be classified according to whether state spaces are 
represented either explicitly or symbolically. 

Fig. 5.2. State space sizes, memory consumption, and generation times (logscale). Note: The curves in the upper left 
diagram are almost identical and, thus, appear to coincide. 

6.1. Explicit State-space Generation. Explicit techniques represent state spaces by trees, hash 
tables, or graphs, where each state corresponds to an entity of the underlying data structure. Thus, the 
memory needed to store the state space of a system is linear in the number of the system’s states. To 
achieve space efficiency, numerous techniques have been introduced, including multi-level data structures [13] 
and merging common bitvectors [24]. To avoid state-space explosion for asynchronous system models, 
researchers often employ compositional construction techniques based on context constraints [23, 26], partial- 
order techniques [22], or symmetry reduction [15]. 

6.2. Symbolic State-space Generation. Regarding synchronous hardware systems, symbolic techniques 
using BDDs, which can represent state spaces in sublinear space, have been thoroughly investigated [17]. 
Several implementations of BDDs are available. We refer the reader to [36] for a good survey on 
BDD packages and their performance. To improve the time efficiency of BDD-based algorithms, breadth- 
first BDD-manipulation algorithms [4] have been explored and compared against the traditional depth-first 
ones. However, the results show no significant speed-ups, although breadth-first algorithms lead to more 
regular access patterns of hash tables and caches. Regarding space efficiency, a fair amount of work has 
concentrated on choosing appropriate variable orderings and on dynamically re-ordering variables [20]. 

For asynchronous software systems, symbolic techniques have been investigated less, and mostly only in 
the setting of Petri nets. For safe Petri nets, BDD-based algorithms for the generation of the reachability 
set have been developed in [33, 35] via encoding each place of a net as a Boolean variable. These algorithms 
are capable of generating state spaces of large nets within hours. Recently, more efficient encodings of nets 
have been introduced, which take place invariants [32] into account, although the underlying logic is still 
based on Boolean variables. In contrast, our work uses a more general version of decision diagrams, namely 
MDDs [25, 30], where more complex information is carried in each node of a diagram. In particular, MDDs 
allow for a natural encoding of asynchronous system models, such as distributed embedded systems. 
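
To illustrate how a decision diagram with multi-valued arcs can represent a state set in sublinear space, the following sketch encodes an explicitly given set of equal-length tuples as a diagram with shared nodes. It is a simplified illustration under our own assumptions, not the data structure used in SMART: the function build_mdd and its unique-table layout are hypothetical, the diagram is built from an enumerated set and not symbolically, and the only reduction applied is the sharing of identical nodes.

```python
from itertools import product
from typing import Dict, FrozenSet, Set, Tuple

State = Tuple[int, ...]
Arcs = Tuple[Tuple[int, int], ...]      # (local state, child node id) pairs


def build_mdd(states: Set[State]) -> Tuple[int, Dict[Tuple[int, Arcs], int]]:
    """Encode a set of equal-length tuples as a diagram with shared nodes.

    Node ids 0 and 1 are the terminals (empty set / accept). The returned
    dictionary is the unique table mapping (depth, arcs) to a node id.
    """
    unique: Dict[Tuple[int, Arcs], int] = {}

    def make(depth: int, suffixes: FrozenSet[State]) -> int:
        if not suffixes:
            return 0
        if suffixes == frozenset({()}):
            return 1
        # Group the remaining suffixes by the local state at this depth.
        by_value: Dict[int, Set[State]] = {}
        for suffix in suffixes:
            by_value.setdefault(suffix[0], set()).add(suffix[1:])
        arcs = tuple(
            (value, make(depth + 1, frozenset(by_value[value])))
            for value in sorted(by_value)
        )
        key = (depth, arcs)
        if key not in unique:            # share nodes with identical arcs
            unique[key] = len(unique) + 2
        return unique[key]

    return make(0, frozenset(states)), unique


# Four components with three local states each: 3**4 = 81 global states.
full = set(product(range(3), repeat=4))
root, table = build_mdd(full)
print(len(full), len(table))
```

For this cross-product example, the 81 states are encoded by four non-terminal nodes, one per level, each with three outgoing arcs. Sets with less regular structure share fewer nodes.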

For the sake of completeness, we briefly mention some other BDD-based techniques exploiting the 
component-based structure of many digital systems. They include partial model checking [3], compositional 
model checking [27], partial-order reduction [2], and conjunctive decompositions [29]. Finally, note that 
approaches to symbolic verification have also been developed that do not rely on decision diagrams but instead 
on arithmetic or algebra [1, 6, 34]. 

7. Conclusions and Future Work. We presented a novel approach for constructing the state spaces 
of asynchronous system models using MDDs. By not encoding a given global next-state function as 
a single MDD, but splitting it into several local next-state functions instead, we gained the freedom to choose the 
sequence of event firings, which controls the fixed-point iteration resulting in the desired global state space. 
Our central contribution is the development of a specific elegant iteration strategy based on saturating MDD 
nodes. Its utility is demonstrated by experimental studies, which show that our algorithm often performs several 
orders of magnitude faster than most existing algorithms. Equally important, the peak sizes of MDDs are 
usually kept close to their final sizes. 

Regarding future work, we plan to employ our idea of saturation for implementing an MDD-based CTL 
model checker within SMART [12], to compare the model checker to state-of-the-art BDD-based model 
checkers, and to test our tool on examples that are extracted from real software. Moreover, we intend to 
investigate whether our new algorithm is suitable for parallelization. 